covariate shift assumption
Common Question Q1: The covariate shift assumption
We thank the reviewers for their insightful and constructive comments. We have submitted code and a detailed Appendix. As for TransCal, it was inadvertently omitted while writing.
Common Question Q2: Will TransCal have lower accuracy while achieving better calibration?
TransCal maintains the same accuracy as before calibration, while achieving a lower ECE (Figure 1(b)).
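For context on why calibration need not cost accuracy: post-hoc calibrators in the temperature-scaling family (on which TransCal builds) only rescale logits by a positive constant, which leaves the argmax prediction, and hence accuracy, unchanged while the confidence scores and ECE can move. A minimal sketch on synthetic logits (the data, temperature value, and function names here are illustrative, not from the TransCal code):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Softmax with temperature T; any T > 0 preserves the argmax."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ece(probs, labels, n_bins=10):
    """Expected Calibration Error: bin-weighted gap between confidence and accuracy."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return total

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 5))
labels = rng.integers(0, 5, size=1000)

# Rescaling by any temperature leaves predictions, and thus accuracy, identical.
acc_before = (softmax(logits).argmax(1) == labels).mean()
acc_after = (softmax(logits, T=2.5).argmax(1) == labels).mean()
ece_before = ece(softmax(logits), labels)
ece_after = ece(softmax(logits, T=2.5), labels)
```

Only the confidence distribution changes with T, which is why ECE can improve while accuracy stays fixed.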
Domain Adaptive Decision Trees: Implications for Accuracy and Fairness
Alvarez, Jose M., Scott, Kristen M., Ruggieri, Salvatore, Berendt, Bettina
In uses of pre-trained machine learning models, it is a known issue that the target population in which the model is deployed may not be reflected in the source population with which the model was trained. This can result in a biased model when deployed, leading to a reduction in model performance. One risk is that, as the population changes, certain demographic groups will be under-served or otherwise disadvantaged by the model, even as they become more represented in the target population. The field of domain adaptation proposes techniques for situations where labeled data for the target population do not exist, but some information about the target distribution does. In this paper we contribute to the domain adaptation literature by introducing domain-adaptive decision trees (DADT). We focus on decision trees given their growing popularity due to their interpretability and performance relative to other, more complex models. With DADT we aim to improve the accuracy of models trained on a source domain (or training data) that differs from the target domain (or test data). We propose an in-processing step that adjusts the information gain split criterion using outside information corresponding to the distribution of the target population. We demonstrate DADT on real data and find that it improves accuracy over a standard decision tree when tested on a shifted target population. We also study the change in fairness under demographic parity and equal opportunity. Results show an improvement in fairness with the use of DADT.
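The in-processing idea described above, adjusting the information gain criterion with outside target-population information, can be sketched as an importance-weighted information gain. A minimal illustration, assuming per-example weights `w` derived from the known target distribution; the function names and two-class setup are hypothetical, not the authors' implementation:

```python
import numpy as np

def weighted_entropy(y, w):
    """Entropy of integer labels y, with each example weighted by w."""
    p = np.bincount(y, weights=w).astype(float)
    p = p / p.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def weighted_info_gain(x, y, w, threshold):
    """Information gain of the split x <= threshold, computed under
    target-distribution weights w instead of raw source counts
    (a sketch of the in-processing re-weighting idea)."""
    left = x <= threshold
    total = w.sum()

    def side(mask):
        if not mask.any():          # empty side contributes nothing
            return 0.0
        return (w[mask].sum() / total) * weighted_entropy(y[mask], w[mask])

    return weighted_entropy(y, w) - side(left) - side(~left)
```

With uniform weights this reduces to the standard information gain; weights skewed toward the target population shift which splits look best.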
On the inductive biases of deep domain adaptation
Siry, Rodrigue, Hémadou, Louis, Simon, Loïc, Jurie, Frédéric
Domain alignment is currently the most prevalent solution to unsupervised domain-adaptation tasks, and alignment methods are often presented as minimizers of theoretical upper bounds on the target-domain risk. However, further work has revealed severe inadequacies between theory and practice: we consolidate this analysis and confirm that imposing domain invariance on features is neither necessary nor sufficient to obtain low target risk. We instead argue that successful deep domain adaptation relies largely on hidden inductive biases found in common practice, such as model pre-training or the design of the encoder architecture. We perform various ablation experiments on popular benchmarks and on our own synthetic transfers to illustrate their role in prototypical situations. To conclude our analysis, we propose to meta-learn parametric inductive biases to solve specific transfers and show their superior performance over handcrafted heuristics.
Unsupervised Domain Adaptation with a Relaxed Covariate Shift Assumption
Adel, Tameem (University of Manchester) | Zhao, Han (Carnegie Mellon University) | Wong, Alexander (University of Waterloo)
The training and test domains are commonly referred to in the domain adaptation literature as the source and target domains, respectively. Domain diversity can emerge as a result of the scarcity of available labeled data from the target domain. It can as well be innate in the problem itself due to, for example, an ongoing change occurring to the source domain, as in cases where the original source domain keeps changing over time. Domain adaptation aims at finding solutions for this kind of problem, where the training (source) data are […]. The distributions can be different (Storkey and Sugiyama 2006; Ben-David and Urner 2012; 2014). Covariate shift is a valid assumption in some problems, but it can as well be quite unrealistic for many other domain adaptation tasks where the conditional label distributions are not (or, more precisely, not guaranteed to be) identical. The simplification resulting from assuming identical labeling distributions facilitates the quest for a tractable learning algorithm, albeit possibly at the cost of reducing the expressive power of the representation, and consequently the accuracy of the resulting hypothesis.
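The covariate shift assumption discussed here, a shared conditional p(y|x) with differing marginals p(x), licenses the classical importance-weighting correction: re-weighting source examples by w(x) = p_tgt(x)/p_src(x) turns source risk into an unbiased estimate of target risk. A small synthetic sketch (the distributions, labeling rule, and fixed hypothesis are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Under covariate shift, the labeling rule is shared across domains;
# only the input distribution p(x) changes.
def label(x):                                # shared (here deterministic) rule
    return (x > 1.0).astype(int)

x_src = rng.normal(-1.0, 1.0, 50_000)        # source inputs ~ N(-1, 1)
x_tgt = rng.normal(+1.0, 1.0, 50_000)        # target inputs ~ N(+1, 1)

def predict(x):                              # a fixed hypothesis to evaluate
    return (x > 0.0).astype(int)

def normal_pdf(x, mu):                       # density of N(mu, 1)
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

# Importance weights w(x) = p_tgt(x) / p_src(x), known here by construction.
w = normal_pdf(x_src, +1.0) / normal_pdf(x_src, -1.0)

err_src = (predict(x_src) != label(x_src)).mean()        # naive source risk
err_iw = ((predict(x_src) != label(x_src)) * w).mean()   # importance-weighted
err_tgt = (predict(x_tgt) != label(x_tgt)).mean()        # true target risk
```

The weighted source estimate tracks the target risk, while the naive source estimate does not; when p(y|x) also shifts, as the paper argues it often does, no re-weighting of x alone can repair the gap.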
Analysis of Kernel Mean Matching under Covariate Shift
Yu, Yaoliang, Szepesvari, Csaba
In real supervised learning scenarios, it is not uncommon that the training and test samples follow different probability distributions, making it necessary to correct the sampling bias. Focusing on a particular covariate shift problem, we derive high-probability confidence bounds for the kernel mean matching (KMM) estimator, whose convergence rate turns out to depend on a regularity measure of the regression function and on a capacity measure of the kernel. By comparing KMM with the natural plug-in estimator, we establish the superiority of the former and thereby provide concrete evidence for, and understanding of, the effectiveness of KMM under covariate shift.
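The KMM estimator analyzed above chooses source-sample weights so that the weighted source mean embedding matches the target mean embedding in the RKHS, which reduces to a quadratic program. A minimal sketch using a general-purpose solver (the RBF bandwidth, box bound B, and tolerance eps are illustrative choices, not those of the paper):

```python
import numpy as np
from scipy.optimize import minimize

def rbf(a, b, gamma=0.5):
    """Gaussian RBF kernel matrix between 1-D samples a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-gamma * d ** 2)

def kmm_weights(x_src, x_tgt, B=10.0, eps=0.01):
    """Kernel mean matching: minimize the RKHS distance between the
    beta-weighted source mean embedding and the target mean embedding,
    cast as the usual QP  min 0.5 beta'K beta - kappa'beta  with a box
    constraint 0 <= beta <= B and near-normalization of sum(beta)."""
    n, m = len(x_src), len(x_tgt)
    K = rbf(x_src, x_src)
    kappa = (n / m) * rbf(x_src, x_tgt).sum(axis=1)

    cons = [  # |sum(beta)/n - 1| <= eps, written as two inequalities
        {"type": "ineq", "fun": lambda b: n * (1 + eps) - b.sum()},
        {"type": "ineq", "fun": lambda b: b.sum() - n * (1 - eps)},
    ]
    res = minimize(lambda b: 0.5 * b @ K @ b - kappa @ b,
                   np.ones(n), jac=lambda b: K @ b - kappa,
                   bounds=[(0.0, B)] * n, constraints=cons, method="SLSQP")
    return res.x

rng = np.random.default_rng(0)
x_src = rng.normal(-1.0, 1.0, 100)
x_tgt = rng.normal(+1.0, 1.0, 100)
beta = kmm_weights(x_src, x_tgt)
# source points lying where the target density is high receive larger weights
```

In contrast, the plug-in estimator the paper compares against would estimate the two densities separately and take their ratio; KMM sidesteps density estimation entirely.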